Many statistical data are imprecise due to factors such as measurement
errors, computation errors, and lack of information. In such cases, data
are better represented by intervals than by single numbers. Existing
methods for analyzing interval-valued data include regressions in the
metric space of intervals and symbolic data analysis, the latter being
proposed in a more general setting. However, there has been a lack of
literature on distribution-based inference for interval-valued data. In
an attempt to fill this gap, we extend the concept of normality for
random sets by Lyashenko (1983) and propose a normal hierarchical model
for random intervals. In addition, we develop a minimum contrast
estimator (MCE) for the model parameters, which we show is both
consistent and asymptotically normal. Simulation studies support our
theoretical findings: both the bias and the standard error of the MCE
decrease as the sample size grows. Finally, we apply our model and MCE
to a real dataset of daily temperature ranges.

1. Introduction. In classical statistics, it is often assumed that the
outcome of an experiment is precise and that the uncertainty of
observations is solely due to randomness. Under this assumption,
numerical data are represented as collections of real numbers. In recent
years, however, there has been increased interest in situations where
exact outcomes of the experiment are very difficult or impossible to
obtain or to measure. The imprecise nature of the data thus collected is
caused by various factors such as measurement errors, computational
errors, and loss or lack of information. Under such circumstances and,
more generally, under any other circumstances such as grouping and
censoring in which observations cannot be pinned down to single numbers,
data are better represented by intervals. Practical examples include
interval-valued stock prices, oil prices, temperature data, medical
records, and mechanical measurements, among many others.

The effort in the literature to analyze interval-valued data, while still
at an early stage, shows promising advances. The earliest attempt probably
dates back to 1990, when Diamond published his paper on the least squares
fitting of compact set-valued data and considered interval-valued input
and output as a special case (see Diamond, 1990). Due to the embedding
theorems started by Brunn and Minkowski and later refined by Rådström (see
Rådström 1952) and Hörmander (see Hörmander 1954),
$\mathcal{K}_C(\mathbb{R}^d)$, the space of all nonempty compact convex
subsets of $\mathbb{R}^d$, is embedded into the Banach space of support
functions. Diamond (1990) defined an $L_2$ metric in this Banach space of
support functions, and found the regression coefficients by minimizing the
$L_2$ metric of the sum of residuals. This idea was further studied in Gil
et al. (2002), where the $L_2$ metric was replaced by a generalized metric
on the space of nonempty compact intervals, called the "W-distance",
proposed earlier by Körner and Näther (1998). Separately, Billard and
Diday (2003) introduced central tendency and dispersion measures and
developed symbolic interval data analysis based on those. (See also
Carvalho et al., 2004.) However, none of the existing literature
considered distributions of the random intervals and the corresponding
statistical methods.

It is well known that normality plays an important role in classical statis- 
tics. But the normal distribution for random sets remained undefined for 
a long time, until the 1980s when the concept of normality was first intro- 
duced for compact convex random sets in the Euclidean space by Lyashenko 
(1983). It is especially useful in deriving limit theorems for random sets;
see Puri et al. (1986) and Norberg (1984), among others. Since a compact
convex set in $\mathbb{R}$ is a closed bounded interval, by the definition of Lyashenko
(1983), a normal random interval is simply a displacement of a fixed closed 
bounded interval. From the point of view of statistics, this is not enough to 
fully capture the randomness of a general random interval. 

In this paper, we extend the definition of normality given by Lyashenko 
(1983) and propose a normal hierarchical model for random intervals. With 
one more degree of freedom on "shape" , our model conveniently captures the 
entire randomness of random intervals via a few parameters. Therefore, it 
adds to the literature the possibility of distribution-based inference methods 
for interval-valued data. In particular, conditional on the first level of
the hierarchy, our normal hierarchical random interval is exactly the
normal random interval defined by Lyashenko (1983). This could be a very
useful property in view of the limit theorems. In addition, with certain
choices of the distributions, a linear combination of our normal
hierarchical random intervals follows the same normal hierarchical
distribution. An immediate consequence of this property is the possibility
of a factor model for multi-dimensional random intervals, based on our
normal hierarchical distribution, as the "factor" will have the same
distribution as the original intervals.

To estimate the parameters and make inferences for our Normal hierarchical
model, we propose a minimum contrast estimator (MCE) based on the hitting
function of the random interval. We show that under certain conditions the
MCE is strongly consistent and asymptotically normal. A simulation study
is carried out for one specific distribution, and the results are
consistent with our theorems. We apply our model to analyze daily
temperature range data; the fitted model suggests a negative correlation
between the location and the length of the daily temperature range.

The rest of the paper is organized as follows. Section 2 formally defines 
our Normal hierarchical model and discusses its statistical properties. Sec- 
tion 3 introduces a minimum contrast estimator for the model parameters, 
and presents its asymptotic properties. A simulation study is reported in 
Section 4, and a real data application is demonstrated in Section 5. We give 
concluding remarks in Section 6. Proofs of the theorems are presented in 
Section 7. Useful lemmas and other proofs are deferred to the Appendix. 

2. The Normal hierarchical model. 

2.1. Definition. Let $(\Omega, \mathcal{L}, P)$ be a probability space.
Denote by $\mathcal{K}$ the collection of all non-empty compact subsets of
$\mathbb{R}^d$. A random compact set is a Borel measurable function
$A: \Omega \to \mathcal{K}$, $\mathcal{K}$ being equipped with the Borel
$\sigma$-algebra induced by the Hausdorff metric. If $A(\omega)$ is convex
for almost all $\omega$, then $A$ is called a random compact convex set.
(See Molchanov 2005, p. 21, p. 102.) Denote by $\mathcal{K}_C$ the
collection of all compact convex subsets of $\mathbb{R}^d$. By Theorem 1 of
Lyashenko (1983), a compact convex random set $A$ in the Euclidean space is
Gaussian if and only if $A$ can be represented as the Minkowski addition of
a fixed compact convex set $M$ and a $d$-dimensional normal random vector
$\epsilon$, i.e.

(1)  $A = M + \{\epsilon\}.$

As pointed out in Lyashenko (1983), the Gaussian random set defined above
is especially useful in view of the limit theorems discussed earlier in
Lyashenko (1979). That is, if the conditions in those theorems are
satisfied and the limit exists, then it is Gaussian in the sense of (1).
Puri et al. (1986) extended these results to a separable Banach space.

In the following, we will restrict ourselves to compact convex random sets
in $\mathbb{R}$, that is, bounded closed random intervals. They will be
called random intervals for ease of presentation.

According to (1), a random interval $A$ is Gaussian if and only if $A$ is
representable in the form

(2)  $A = I + \{\epsilon\},$

where $I$ is a fixed bounded closed interval and $\epsilon$ is a normal
random variable. Obviously, such a random interval is simply a displacement
of a fixed interval, so it is not enough to fully capture the randomness of
a general random interval. In order to model the randomness of both the
location and the "shape" (length), we propose the following Normal
hierarchical model for random intervals:

(3)  $A = I + \{\epsilon\},$

(4)  $I = \eta I_0,$

where $\eta$ is another random variable and $I_0$ is a fixed interval in
$\mathbb{R}$. Here, the product $\eta I_0$ is in the sense of scalar
multiplication of a real number and a set. Let $\lambda(\cdot)$ denote the
Lebesgue measure of a compact convex subset of $\mathbb{R}^d$, which in the
case $d = 1$ is the length of an interval. Then,

$\lambda(A) = \lambda(\epsilon + \eta I_0) = \lambda(\eta I_0) = |\eta| \lambda(I_0).$

That is, $\eta$ is the parameter that models the length of $A$. In
particular, if $\eta \to 0$, then $A$ reduces to a normal random variable.
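As a hedged illustration of (3)-(4), the following Python sketch draws one interval and checks the length identity $\lambda(A) = |\eta| \lambda(I_0)$. The function name `sample_interval`, the independence of $\epsilon$ and $\eta$, and all numerical values are assumptions made only for this illustration; the model itself does not fix the distribution of $\eta$ at this point.

```python
import random

def sample_interval(a0, b0, mu_eta, sd_eps, sd_eta, rng):
    """Draw one interval A = eps + eta * [a0, b0].

    Assumption (illustration only): eps ~ N(0, sd_eps^2) and
    eta ~ N(mu_eta, sd_eta^2) are independent.
    """
    eps = rng.gauss(0.0, sd_eps)
    eta = rng.gauss(mu_eta, sd_eta)
    lo, hi = eps + eta * a0, eps + eta * b0
    if lo > hi:  # a negative eta flips the endpoints of eta * I0
        lo, hi = hi, lo
    return lo, hi, eps, eta

rng = random.Random(0)  # fixed seed so the draw is reproducible
lo, hi, eps, eta = sample_interval(1.0, 2.0, 20.0, 3.0, 3.0, rng)
length = hi - lo        # should equal |eta| * (b0 - a0)
```

The location shift `eps` cancels in `hi - lo`, which is why only `eta` governs the length.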

An interesting property of the Normal hierarchical random interval is that 
its linear combination is still a Normal hierarchical random interval. This is 
seen by simply observing that 

(5)  $\sum_{i=1}^{n} a_i A_i = \sum_{i=1}^{n} a_i (\epsilon_i + \eta_i I_0) = \sum_{i=1}^{n} a_i \epsilon_i + I_0 \left( \sum_{i=1}^{n} a_i \eta_i \right),$

for arbitrary constants $a_i$, $i = 1, \cdots, n$, where "+" denotes the
Minkowski addition. This is very useful in developing a factor model for
the analysis of multiple random intervals. In particular, if we assume
$\eta_i \sim N(\mu_i, \sigma_i^2)$, $i = 1, \cdots, n$, then the "factor"
$\sum_{i=1}^{n} a_i A_i$ has exactly the same distribution as the original
random intervals. We will elaborate more on this issue in section 4.
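The identity (5) can be checked numerically on a small example. The sketch below is only an illustration under stated assumptions: the helper names `scale` and `minkowski_sum` and all numbers are hypothetical, and the weights $a_i$ and shape draws $\eta_i$ are taken nonnegative so that scalar multiplication distributes over Minkowski addition without any sign bookkeeping.

```python
def scale(c, interval):
    """Scalar multiple c * [lo, hi] of an interval."""
    lo, hi = c * interval[0], c * interval[1]
    return (min(lo, hi), max(lo, hi))

def minkowski_sum(intervals):
    """Minkowski sum of intervals: add lower and upper endpoints."""
    return (sum(i[0] for i in intervals), sum(i[1] for i in intervals))

I0 = (1.0, 2.0)            # hypothetical fixed interval [a0, b0]
eps = [0.5, -1.0, 2.0]     # hypothetical location draws
eta = [3.0, 4.0, 5.0]      # hypothetical positive shape draws
a = [0.2, 0.3, 0.5]        # hypothetical nonnegative weights

# Left side of (5): sum_i a_i * A_i with A_i = eps_i + eta_i * I0.
A = [(e + scale(h, I0)[0], e + scale(h, I0)[1]) for e, h in zip(eps, eta)]
lhs = minkowski_sum([scale(w, Ai) for w, Ai in zip(a, A)])

# Right side of (5): sum_i a_i * eps_i + I0 * sum_i a_i * eta_i.
shift = sum(w * e for w, e in zip(a, eps))
factor = sum(w * h for w, h in zip(a, eta))
rhs = (shift + scale(factor, I0)[0], shift + scale(factor, I0)[1])
```

Both sides evaluate to the interval [5.1, 9.4] up to floating-point rounding.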

Without loss of generality, we can assume in the model (3)-(4) that
$E\epsilon = 0$. We will make this assumption throughout the rest of the paper.

2.2. Model properties. The fundamental theory of random sets was de- 
veloped in the 1960s and 1970s. See, e.g., Matheron (1967), Matheron (1975), 
and Kendall (1974). According to the Choquet theorem (Molchanov 2005, 
p. 10), the distribution of a random closed set (and random compact convex 
set as a special case) $A$ is completely characterized by the hitting
function $T$ defined as:

(6)  $T_A(K) = P(A \cap K \neq \emptyset), \quad \forall K \in \mathcal{K}.$

$T$ is also called the Choquet capacity functional. For compact convex
sets, there is another functional, the containment functional
$C_A(K) = P(A \subset K)$, $\forall K \in \mathcal{K}_C$, which also
uniquely determines the distribution. But we are not considering $C_A(K)$
in this paper.

Writing $I_0 = [a_0, b_0]$ with $a_0 \le b_0$, the Normal hierarchical
random interval in (3)-(4) has the following hitting function:
$\forall [a, b] \in \mathcal{K}_C$,

$T_A([a, b])$
$= P([a, b] \cap A \neq \emptyset)$
$= P([a, b] \cap A \neq \emptyset, \eta \ge 0) + P([a, b] \cap A \neq \emptyset, \eta < 0)$
$= P(a - \eta b_0 \le \epsilon \le b - \eta a_0, \eta \ge 0) + P(a - \eta a_0 \le \epsilon \le b - \eta b_0, \eta < 0).$
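The last equality rewrites the event "the interval $[a, b]$ hits $A$" in terms of $(\epsilon, \eta)$. A minimal sketch, assuming hypothetical values for $I_0$ and an arbitrary sampling scheme chosen only to exercise both signs of $\eta$, compares the geometric test with the algebraic one on random draws:

```python
import random

def hits(a, b, lo, hi):
    """True if the intervals [a, b] and [lo, hi] intersect."""
    return lo <= b and hi >= a

def hits_by_formula(a, b, eps, eta, a0, b0):
    """The same event written in terms of (eps, eta), as in the display above."""
    if eta >= 0:
        return a - eta * b0 <= eps <= b - eta * a0
    return a - eta * a0 <= eps <= b - eta * b0

rng = random.Random(1)
a0, b0 = 1.0, 2.0              # hypothetical I0 = [a0, b0]
agree = True
for _ in range(10000):
    # eta is centred at 0 on purpose so that both branches are exercised
    eps, eta = rng.gauss(0.0, 3.0), rng.gauss(0.0, 3.0)
    lo, hi = sorted((eps + eta * a0, eps + eta * b0))
    a = rng.uniform(-10.0, 10.0)
    b = a + rng.uniform(0.0, 5.0)
    agree = agree and (hits(a, b, lo, hi) == hits_by_formula(a, b, eps, eta, a0, b0))
```

The flag `agree` stays true when the two descriptions of the event coincide on every draw.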

The expected value of a compact convex random set $A$ is defined by the
Aumann integral of a set-valued function (see Aumann 1965, Artstein and
Vitale 1975) as

$EA = \{E\xi : \xi \in A \text{ almost surely}\}.$

In particular, the Aumann expectation of a random interval $A$ is given by

(7)  $EA = [EA_l, EA_u],$

where $A_l$ and $A_u$ are the lower and upper bounds of $A$ respectively.
Therefore, the Aumann expectation of the Normal hierarchical random
interval $A$ is

$EA = E(\epsilon + \eta I_0) = E\epsilon + E(\eta I_0) = E(\eta I_0)$
$= E\{[a_0\eta, b_0\eta] I_{(\eta \ge 0)} + [b_0\eta, a_0\eta] I_{(\eta < 0)}\}$
$= E[a_0\eta I_{(\eta \ge 0)} + b_0\eta I_{(\eta < 0)}, \; b_0\eta I_{(\eta \ge 0)} + a_0\eta I_{(\eta < 0)}]$
$= [a_0 E\eta_+ + b_0 E\eta_-, \; b_0 E\eta_+ + a_0 E\eta_-],$

where $\eta_+ = \eta I_{(\eta \ge 0)}$ and $\eta_- = \eta I_{(\eta < 0)}$.
Notice that $\eta_+$ can be interpreted as the positive part of $\eta$, but
$\eta_-$ is not the negative part of $\eta$, as $\eta_- < 0$ when
$\eta < 0$.

The variance of a compact convex random set $A$ in $\mathbb{R}^d$ is
defined via its support function. (See Körner 1995, 1997.) Let $S^{d-1}$ be
the unit sphere in $\mathbb{R}^d$, and let $\mu$ be the normalized
$(d-1)$-dimensional Lebesgue measure on $S^{d-1}$, i.e.,
$\mu(S^{d-1}) = 1$. The support function $s_A(\cdot)$ of $A$ is defined as

$s_A(u) = \sup_{a \in A} \langle u, a \rangle, \quad \forall u \in S^{d-1}.$

A compact convex set $A$ corresponds uniquely to its support function
$s_A$. See Schneider (1993, p. 37) for example. Let $\|\cdot\|_2$ denote
the $L_2$-metric in the space of Lebesgue square integrable functions on
$S^{d-1}$. The $\delta_2$ metric in $\mathcal{K}_C$ is defined by

$\delta_2(A, B) = \|s_A - s_B\|_2 = \left( d \int_{S^{d-1}} |s_A(u) - s_B(u)|^2 \mu(du) \right)^{1/2}, \quad \forall A, B \in \mathcal{K}_C.$

Then the variance of $A$ is defined as

(8)  $Var(A) = E\delta_2^2(A, EA),$

where $EA$ is the Aumann expectation defined in (7). In the special case
when $d = 1$, it is shown by straightforward calculations that

(9)  $Var(A) = \frac{1}{2} Var(A_l) + \frac{1}{2} Var(A_u)$

for a random interval $A$. See, for example, Körner (1995). For the Normal
hierarchical random interval $A$,

$Var(A_l)$
$= Var(\epsilon + a_0\eta_+ + b_0\eta_-)$
$= E(\epsilon + a_0\eta_+ + b_0\eta_-)^2 - [E(\epsilon + a_0\eta_+ + b_0\eta_-)]^2$
$= E\epsilon^2 + a_0^2 Var(\eta_+) + b_0^2 Var(\eta_-) + 2(a_0 E\epsilon\eta_+ + b_0 E\epsilon\eta_- - a_0 b_0 E\eta_+ E\eta_-),$

and

$Var(A_u)$
$= Var(\epsilon + b_0\eta_+ + a_0\eta_-)$
$= E(\epsilon + b_0\eta_+ + a_0\eta_-)^2 - [E(\epsilon + b_0\eta_+ + a_0\eta_-)]^2$
$= E\epsilon^2 + b_0^2 Var(\eta_+) + a_0^2 Var(\eta_-) + 2(b_0 E\epsilon\eta_+ + a_0 E\epsilon\eta_- - a_0 b_0 E\eta_+ E\eta_-).$

The variance of $A$ is then found to be

$Var(A) = \frac{1}{2} Var(A_l) + \frac{1}{2} Var(A_u)$
$= E\epsilon^2 + \frac{1}{2}(a_0^2 + b_0^2)[Var(\eta_+) + Var(\eta_-)] + (a_0 + b_0) E\epsilon\eta - 2 a_0 b_0 E\eta_+ E\eta_-.$

Remark. From the discussion earlier in this section, we see that for the
Normal hierarchical model (3)-(4), $\epsilon$ and $\eta$ are the "location"
and "shape" parameters respectively. Therefore, in most cases, it is
sufficient to assume that $\eta > 0$. Under this assumption, we have
$\eta_+ = \eta$ and $\eta_- = 0$. Consequently,

$EA = E\eta \, [a_0, b_0],$

and

$Var(A) = E\epsilon^2 + \frac{1}{2}(a_0^2 + b_0^2) Var(\eta) + (a_0 + b_0) E\epsilon\eta$
$= Var(\epsilon) + \frac{1}{2}(a_0^2 + b_0^2) Var(\eta) + (a_0 + b_0) Cov(\epsilon, \eta),$

with $E\epsilon = 0$.
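The two closed forms in the Remark are easy to evaluate. The following sketch is an illustration only: the function names are hypothetical, and the inputs are chosen to mirror the parameter values used later in the simulation of Section 4, treating $\eta$ as positive.

```python
def aumann_mean(a0, b0, mean_eta):
    """E(A) = E(eta) * [a0, b0], valid when eta > 0 and E(eps) = 0."""
    return (mean_eta * a0, mean_eta * b0)

def interval_variance(a0, b0, var_eps, var_eta, cov_eps_eta):
    """Var(A) from the Remark, valid when eta > 0 and E(eps) = 0."""
    return (var_eps
            + 0.5 * (a0 ** 2 + b0 ** 2) * var_eta
            + (a0 + b0) * cov_eps_eta)

# Hypothetical inputs: a0 = 1, b0 = 2, E(eta) = 20,
# Var(eps) = Var(eta) = 10, Cov(eps, eta) = 1.
mean_A = aumann_mean(1.0, 2.0, 20.0)
var_A = interval_variance(1.0, 2.0, 10.0, 10.0, 1.0)
```

With these inputs the mean is the interval [20, 40] and the variance is 10 + 25 + 3 = 38.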

3. The minimum contrast estimator. 

3.1. Definitions. We study the minimum contrast estimator (MCE) of the
Normal hierarchical random interval (3)-(4), as well as its asymptotic
properties. Since $d = 1$, from now on we let $\mathcal{K}$ be the space of
all non-empty compact subsets of $\mathbb{R}$, and let $\mathcal{F}$ be the
Borel $\sigma$-algebra on $\mathcal{K}$ induced by the Hausdorff metric.
Let $\mathcal{K}_C$ denote the space of all non-empty compact convex
subsets, i.e., bounded closed intervals, in $\mathbb{R}$. As mentioned in
the previous section, a random interval $X$ is a Borel measurable function
from a probability space $(\Omega, \mathcal{L}, P)$ to
$(\mathcal{K}, \mathcal{F})$ such that $X \in \mathcal{K}_C$ almost surely.

Throughout this section, we assume that we observe a sample of i.i.d.
random intervals $X(n) = \{X_1, X_2, \cdots, X_n\}$. Let $\theta$ denote a
$p \times 1$ vector containing all the parameters in the model, which takes
on a value from a parameter space $\Theta \subset \mathbb{R}^p$. Here $p$
is the number of parameters. Let $\theta_0$ denote the true value of the
parameter vector. Denote by $T_\theta(a, b)$ the hitting function of $X_i$
with respect to $\theta$, $\forall [a, b] \in \mathcal{K}_C$.

In order to introduce the MCE, we will need some extra notation. Let
$\mathcal{X}$ be a basic set and $\mathcal{A}$ be a $\sigma$-field over it.
Let $\mathcal{B}$ denote a family of probability measures on
$(\mathcal{X}, \mathcal{A})$ and $\tau$ be a mapping from $\mathcal{B}$ to
some topological space $T$. $\tau(P)$ denotes the parameter value
pertaining to $P$, $\forall P \in \mathcal{B}$. The classical definition of
MCE given in Pfanzagl (1969) is quoted below.

Definition 1. [Pfanzagl (1969)] A family of $\mathcal{A}$-measurable
functions $f_t: \mathcal{X} \to \mathbb{R}$, $t \in T$, is a family of
contrast functions if

(10)  $E_P[f_t] < \infty, \quad \forall t \in T, \forall P \in \mathcal{B},$

and

(11)  $E_P[f_{\tau(P)}] < E_P[f_t], \quad \forall t \in T, \forall P \in \mathcal{B}, t \neq \tau(P).$

In other words, a contrast function is a measurable function of the random
variable(s) whose expected value reaches its minimum under the probability
measure that generates the random variable(s). From the viewpoint of
probability, a contrast function tends to take a smaller value at the true
parameters than at other parameters. The contrast function is an essential
concept in classical statistics. For example, the negative log likelihood
(and the negative log density when there is one single observation) is a
contrast function, and minimizing it leads to the well-known maximum
likelihood estimation (MLE).

Adopting some notation from Pfanzagl (1969), we let $\mathcal{B}$ denote a
family of probability measures on $(\mathcal{K}_C, \mathcal{F})$ and $\tau$
be a mapping from $\mathcal{B}$ to some topological space $T$. Similarly,
$\tau(P)$ denotes the parameter value pertaining to $P$,
$\forall P \in \mathcal{B}$. We modify the notion of MCE in Heinrich (1993)
according to the scenario of random intervals, and give our definition of
contrast function below. The MCE is then defined as the minimizer of the
contrast function.

Definition 2. A family of $\mathcal{F}^n$-measurable functions
$M(X(n); \theta): \mathcal{K}_C^n \to [-\infty, +\infty]$,
$n \in \mathbb{N}$, $\theta \in \Theta$, is a family of contrast functions
for $\mathcal{B}$, if there exists a function
$N(\cdot, \cdot): \Theta \times \Theta \to \mathbb{R}$ such that

(12)  $P_\theta\left( \{\omega : \lim_{n \to \infty} M(X(n); \zeta) = N(\theta, \zeta)\} \right) = 1, \quad \forall \theta, \zeta \in \Theta,$

and

(13)  $N(\theta, \theta) < N(\theta, \zeta), \quad \forall \theta, \zeta \in \Theta, \theta \neq \zeta.$

Definition 3. An $\mathcal{F}^n$-measurable function
$\hat{\theta}_n: \mathcal{K}_C^n \to \tau(\mathcal{B})$, which depends on
$X(n)$ only, is called a minimum contrast estimator (MCE) if

(14)  $M(X(n); \hat{\theta}_n) = \inf\{M(X(n); \theta) : \theta \in \tau(\mathcal{B})\}.$

3.2. Theoretical results. We make the following assumptions to present 
the theoretical results in this section. 

Assumption 1. $\Theta$ is compact, and $\theta_0$ is an interior point of
$\Theta$.

Assumption 2. The model is identifiable.

Assumption 3. $T_\theta(\cdot, \cdot)$ is continuous with respect to
$\theta$.

Assumption 4. $\frac{\partial T_\theta}{\partial \theta_i}(\cdot, \cdot)$,
$i = 1, \cdots, p$, exist and are finite on a bounded region
$S^0 \subset \mathbb{R}^2$.

Assumption 5. $\frac{\partial T_\theta}{\partial \theta_i}(\cdot, \cdot)$,
$\frac{\partial^2 T_\theta}{\partial \theta_i \partial \theta_j}(\cdot, \cdot)$,
and
$\frac{\partial^3 T_\theta}{\partial \theta_i \partial \theta_j \partial \theta_k}(\cdot, \cdot)$,
$i, j, k = 1, \cdots, p$, exist and are finite on $S^0$ for
$\theta \in \Theta$.

Assumptions 4 and 5 are essential to establish the asymptotic normality of
the MCE $\hat{\theta}_n$. They are rather mild and can be met by a large
class of capacity functionals. For example, if $S^0$ is closed, then each
$T_\theta$ with continuous partial derivatives up to third order satisfies
both assumptions, as a continuous function on a compact region is always
bounded. The following theorem gives sufficient conditions under which the
minimum contrast estimator $\hat{\theta}_n$ defined above is strongly
consistent.

Theorem 1. Let $M(X(n); \theta)$ be a contrast function as in Definition 2
and let $\hat{\theta}_n$ be the corresponding MCE. Under Assumption 1, and
if in addition $M(X(n); \theta)$ is equicontinuous w.r.t. $\theta$ for all
$X(n)$, $n = 1, 2, \cdots$, then

$\hat{\theta}_n \to \theta_0$ a.s., as $n \to \infty$.

Let $[a, b] \in \mathcal{K}_C$. Define an empirical estimator
$\hat{T}(a, b; X(n))$ for $T(a, b)$ as:

(15)  $\hat{T}(a, b; X(n)) = \frac{\#\{X_i : [a, b] \cap X_i \neq \emptyset, \; i = 1, \cdots, n\}}{n}.$
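The estimator (15) is a simple proportion. A minimal sketch, assuming the sample is stored as a list of (lower, upper) pairs; the function name `empirical_hitting` and the toy data are hypothetical:

```python
def empirical_hitting(a, b, sample):
    """Fraction of observed intervals (lo, hi) that intersect [a, b], as in (15)."""
    count = sum(1 for lo, hi in sample if lo <= b and hi >= a)
    return count / len(sample)

# Hypothetical toy sample of four observed intervals.
sample = [(0.0, 1.0), (0.5, 2.0), (3.0, 4.0), (5.0, 6.0)]
t_hat = empirical_hitting(0.8, 3.0, sample)
```

Here the first three intervals intersect [0.8, 3.0] (the third one only at the point 3.0), so `t_hat` equals 0.75.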



Extending the contrast function defined in Heinrich (1993) (for parameters
in the Boolean model), we construct a family of functions:

(16)  $H(X(n); \theta) = \iint_S \left[ T_\theta(a, b) - \hat{T}(a, b; X(n)) \right]^2 W(a, b) \, da \, db,$

for $\theta \in \Theta$, where $S \subset \mathbb{R}^2$, and $W(a, b)$ is a
weight function on $[a, b]$ satisfying $0 < W(a, b) < C$,
$\forall [a, b] \in \mathcal{K}_C$.
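To make (16) concrete, the sketch below approximates $H$ by a midpoint rule and minimizes it over a short list of candidate parameters. Everything model-specific here is an assumption for illustration: `toy_T` is the hitting function of a deliberately simple toy model (unit-length intervals with a uniformly distributed left endpoint), not the Normal hierarchical model, and the grid, the constant weight, and the candidate list are hypothetical choices.

```python
def empirical_hitting(a, b, sample):
    """Fraction of observed intervals (lo, hi) that intersect [a, b]."""
    return sum(1 for lo, hi in sample if lo <= b and hi >= a) / len(sample)

def contrast(model_T, theta, sample, grid, cell_area, weight=lambda a, b: 1.0):
    """Midpoint-rule approximation of H(X(n); theta) in (16).

    model_T(theta, a, b) is a user-supplied hitting function T_theta(a, b);
    grid lists the cell midpoints (a, b) covering the region S.
    """
    total = 0.0
    for a, b in grid:
        diff = model_T(theta, a, b) - empirical_hitting(a, b, sample)
        total += diff * diff * weight(a, b) * cell_area
    return total

def toy_T(theta, a, b):
    """Toy model (assumption): X = [Z, Z + 1] with Z ~ Uniform(theta, theta + 1)."""
    overlap = min(b, theta + 1.0) - max(a - 1.0, theta)
    return max(0.0, overlap)

# Deterministic pseudo-sample from the toy model with true theta0 = 2.
n = 50
sample = [(2.0 + (i + 0.5) / n, 3.0 + (i + 0.5) / n) for i in range(n)]
# Cell midpoints: a on a 0.5-spaced grid, b = a + width, two widths per a.
grid = [(0.25 + 0.5 * i, 0.25 + 0.5 * i + y) for i in range(10) for y in (0.25, 0.75)]
candidates = [1.0, 1.5, 2.0, 2.5, 3.0]
theta_hat = min(candidates, key=lambda t: contrast(toy_T, t, sample, grid, 0.25))
```

A crude search over candidates is used only to keep the sketch dependency-free; in practice a numerical optimizer would replace the last line.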

We show in the next Proposition that $H(X(n); \theta)$,
$\theta \in \Theta$, defined in (16) is a family of contrast functions for
$\theta$. This, together with Theorem 1, immediately yields the strong
consistency of the associated MCE. This result is summarized in
Corollary 1.

Proposition 1. Suppose that Assumption 2 and Assumption 3 are satisfied.
Then $H(X(n); \theta)$, $\theta \in \Theta$, as defined in (16), is a
family of contrast functions with limiting function

(17)  $N(\theta, \zeta) = \iint_S \left[ T_\theta(a, b) - T_\zeta(a, b) \right]^2 W(a, b) \, da \, db.$

In addition, $H(X(n); \theta)$ is equicontinuous w.r.t. $\theta$.

Corollary 1. Suppose that Assumption 1, Assumption 2, and Assumption 3 are
satisfied. Let $H(X(n); \theta)$ be defined as in (16), and

(18)  $\hat{\theta}_n^H = \arg\min_{\theta \in \Theta} H(X(n); \theta).$

Then

$\hat{\theta}_n^H \to \theta_0$ a.s., as $n \to \infty$.



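Next, we show the asymptotic normality of $\hat{\theta}_n^H$. As a
preparation, we first prove the following proposition. The central limit
theorem for $\hat{\theta}_n^H$ is then presented afterwards.

Proposition 2. Assume the conditions of Lemma 1 (in the Appendix). Define

$\frac{\partial H}{\partial \theta}(X(n); \theta) = \left[ \frac{\partial H}{\partial \theta_1}(X(n); \theta), \cdots, \frac{\partial H}{\partial \theta_p}(X(n); \theta) \right]^T$

as the $p \times 1$ gradient vector of $H(X(n); \theta)$ w.r.t. $\theta$.
Then,

$\sqrt{n} \, \frac{\partial H}{\partial \theta}(X(n); \theta_0) \xrightarrow{D} N(0, \Xi),$

where $\Xi$ is the $p \times p$ symmetric matrix with the $(i, j)$
component

(19)  $\Xi(i, j) = 4 \iiiint_{S \times S} \left\{ P(X_1 \cap [a, b] \neq \emptyset, X_1 \cap [c, d] \neq \emptyset) - T_{\theta_0}(a, b) T_{\theta_0}(c, d) \right\} \frac{\partial T_{\theta_0}}{\partial \theta_i}(a, b) \frac{\partial T_{\theta_0}}{\partial \theta_j}(c, d) W(a, b) W(c, d) \, da \, db \, dc \, dd.$

Theorem 2. Let $H(X(n); \theta)$ be defined in (16) and
$\hat{\theta}_n^H$ be defined in (18). Assume the conditions of
Corollary 1. If additionally Assumption 5 is satisfied, then

(20)  $\sqrt{n} \left( \hat{\theta}_n^H - \theta_0 \right) \xrightarrow{D} N\left( 0, \; C(T_{\theta_0})^{-1} \, \Xi \, C(T_{\theta_0})^{-1} \right),$

where
$C(T_{\theta_0}) = 2 \iint_S \left( \frac{\partial T_{\theta_0}}{\partial \theta} \right) \left( \frac{\partial T_{\theta_0}}{\partial \theta} \right)^T (a, b) W(a, b) \, da \, db$,
and $\Xi$ is defined in (19).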



e 


~ BVN ^ 


'0 


,s = 




0"12 






A 




0"12 






bo 




ao + 1. 







4. Simulation. We carry out a small simulation to investigate the per- 
formance of the MCE introduced in Definition 3. Assume, in the Normal 
hierarchical model (3)- (4), that 

(21) 

and 
(22) 

The bivariate normal distribution conveniently takes care of the variances
and covariance of the location parameter $\epsilon$ and the shape parameter
$\eta$. In addition, as seen in (5) in section 2, it is one distribution
that makes the "factors" identically distributed as the observed random
intervals in a factor model for multiple random intervals. The removal of
the freedom of $b_0$ is for model identifiability purposes; it is seen that
the hitting function is defined via $\eta a_0$ and $\eta b_0$ only. For the
simulation, we assign the following parameter values:

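$a_0 = 1, \quad \mu = 20, \quad \Sigma = \begin{pmatrix} 10 & 1 \\ 1 & 10 \end{pmatrix}.$

According to these values, $P(\eta < 0) = 1.2698 \times 10^{-10}$.
Therefore the hitting function is approximately:

$T_\theta([a, b])$
$= P(a - \eta b_0 \le \epsilon \le b - \eta a_0, \eta \ge 0) + P(a - \eta a_0 \le \epsilon \le b - \eta b_0, \eta < 0)$
$\approx P(a - \eta b_0 \le \epsilon \le b - \eta a_0, \eta \ge 0)$
$\approx P(a - \eta b_0 \le \epsilon \le b - \eta a_0)$
$= P\left( \begin{pmatrix} 1 & a_0 \\ -1 & -a_0 - 1 \end{pmatrix} \begin{pmatrix} \epsilon \\ \eta \end{pmatrix} \le \begin{pmatrix} b \\ -a \end{pmatrix} \right)$
$= \Phi\left( \begin{pmatrix} b \\ -a \end{pmatrix}; \; D \begin{pmatrix} 0 \\ \mu \end{pmatrix}, \; D \Sigma D^T \right),$

where each approximation changes the value by at most $P(\eta < 0)$,
$\Phi(x; \mu, \Sigma)$ is the bivariate normal cdf with mean $\mu$ and
covariance $\Sigma$, and
$D = \begin{pmatrix} 1 & a_0 \\ -1 & -a_0 - 1 \end{pmatrix}$. This is
another convenience offered by the bivariate normal distribution.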
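The probability $P(a - \eta b_0 \le \epsilon \le b - \eta a_0)$ can also be evaluated without a bivariate normal cdf routine. The sketch below is a swapped-in technique, not the computation described in the text: it conditions on $\eta$, uses the normal conditional law of $\epsilon$ given $\eta$, and integrates over $\eta$ with the trapezoid rule. The function names, the integration range of eight standard deviations, and the number of nodes are assumptions made for illustration.

```python
import math

def norm_cdf(x):
    """Standard normal cdf via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def hitting_prob(a, b, a0, mu, s11, s12, s22, nodes=400):
    """Approximate P(a - eta*b0 <= eps <= b - eta*a0) with b0 = a0 + 1.

    (eps, eta) is bivariate normal with mean (0, mu) and covariance
    [[s11, s12], [s12, s22]]. Given eta, eps is normal with mean
    (s12 / s22) * (eta - mu) and variance s11 - s12^2 / s22.
    """
    b0 = a0 + 1.0
    sd_eta = math.sqrt(s22)
    slope = s12 / s22
    sd_cond = math.sqrt(s11 - s12 * s12 / s22)
    lo_eta, hi_eta = mu - 8.0 * sd_eta, mu + 8.0 * sd_eta
    h = (hi_eta - lo_eta) / nodes
    total = 0.0
    for k in range(nodes + 1):
        eta = lo_eta + k * h
        m = slope * (eta - mu)
        upper = norm_cdf((b - eta * a0 - m) / sd_cond)
        lower = norm_cdf((a - eta * b0 - m) / sd_cond)
        dens = math.exp(-0.5 * ((eta - mu) / sd_eta) ** 2) / (sd_eta * math.sqrt(2.0 * math.pi))
        w = 0.5 if k in (0, nodes) else 1.0  # trapezoid end weights
        # max(0, .) handles eta values for which the eps-range is empty
        total += w * max(0.0, upper - lower) * dens * h
    return total

# Parameter values of the simulation: a0 = 1, mu = 20, Sigma = [[10, 1], [1, 10]].
t_far = hitting_prob(-100.0, -90.0, 1.0, 20.0, 10.0, 1.0, 10.0)
t_wide = hitting_prob(-100.0, 200.0, 1.0, 20.0, 10.0, 1.0, 10.0)
t_mid = hitting_prob(25.0, 30.0, 1.0, 20.0, 10.0, 1.0, 10.0)
```

A test interval far from the bulk of the data gives a value near 0, a very wide one gives a value near 1, and the hitting probability grows when the test interval is enlarged.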

We simulate a random sample of size $n$ from model (3)-(4) with the
assigned parameter values, and then compute the MCE's for the model
parameters based on the simulated sample. The process is repeated 10 times
independently for each $n$, and we let $n = 100, 200, 300, 400, 500$,
successively, to study the consistency and efficiency of the MCE's. The
minimization is carried out in Matlab 2011a using the function
fminsearch.m. Figure 1 shows one random sample of 100 observations
generated from the model. We show the average biases and standard errors
of the estimates as functions of the sample size in Figure 2. Here, the
average bias and standard error of the estimates of $\Sigma$ are the $L_2$
norms of the average bias and standard error matrices, respectively. As
expected from Corollary 1 and Theorem 2, both the bias and the standard
error decrease toward 0 as the sample size grows. The numerical results
are summarized in Table 1.

Finally, we point out that the choice of the region of integration $S$ is
important. A larger $S$ usually leads to more accurate estimates, but could
also result in more computational complexity. We do not investigate this
issue in this paper. However, based on our simulation experience, an $S$
that covers most of the points $(a, b) \in \mathbb{R}^2$ such that $[a, b]$
hits some of the observed intervals is a good choice as a rule of thumb.
In our simulation, $E(A) \approx [20, 40]$, by ignoring a small probability
$P(\eta < 0)$. Therefore, we choose
$S = \{(x - y, x + y) : 20 \le x \le 40, 0 \le y \le 10\}$, and the
estimates are satisfactory.
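A region of this form is convenient to discretize in the centre and half-length coordinates $(x, y)$. The following sketch is one possible discretization, not the one used for the reported results; the function name `region_grid` and the grid resolution are assumptions.

```python
def region_grid(x_lo, x_hi, y_lo, y_hi, nx, ny):
    """Midpoints of an nx-by-ny grid over S = {(x - y, x + y)}.

    x is the interval centre and y the half-length, so each returned pair
    (a, b) = (x - y, x + y) satisfies a <= b whenever y >= 0.
    """
    dx, dy = (x_hi - x_lo) / nx, (y_hi - y_lo) / ny
    points = []
    for i in range(nx):
        x = x_lo + (i + 0.5) * dx
        for j in range(ny):
            y = y_lo + (j + 0.5) * dy
            points.append((x - y, x + y))
    # Area element: da db = 2 dx dy under the map (a, b) = (x - y, x + y).
    return points, 2.0 * dx * dy

grid, cell_area = region_grid(20.0, 40.0, 0.0, 10.0, 20, 10)
```

The factor 2 is the Jacobian of the change of variables, so the cells add up to the area of $S$ in the $(a, b)$ plane, which is 400 for the region above.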



[Figure: a simulated random sample of the model]

Fig 1. Plot of a simulated sample from model (3)-(4) with n = 100.

Fig 2. Average bias and standard error of the MCE's for $a_0$ (top left),
$\mu$ (top right), and $\Sigma$ (bottom), as a function of the sample
size n.



5. A real data application. In this section, we apply our Normal
hierarchical model and minimum contrast estimator to analyze daily
temperature range data. We consider two data sets containing ten years of
daily minimum and maximum temperatures in January, in Granite Falls,
Minnesota (latitude 44.81241, longitude 95.51389) from 1901 to 1910, and
from 2001 to 2010, respectively. Each data set therefore consists of 310
observations of the form [minimum temperature, maximum temperature]. We
obtained these data from the National Weather Service, and all
observations are in degrees Fahrenheit. The plot of the data is shown in
Figure 3.

As in the simulation, we assume a bivariate normal distribution for
$(\epsilon, \eta)$ and that $I_0 = [a_0, a_0 + 1]$ has length 1. The
minimum contrast estimates for the model parameters are:



Data set 1 (1901-1910):

$\hat{a}_{0,1} = 0.2495, \quad \hat{\mu}_1 = 19.8573, \quad \hat{\Sigma}_1 = \begin{pmatrix} 207.1454 & -44.8547 \\ -44.8547 & 102.5263 \end{pmatrix}.$

Data set 2 (2001-2010):

$\hat{a}_{0,2} = 0.2614, \quad \hat{\mu}_2 = 20.4722, \quad \hat{\Sigma}_2 = \begin{pmatrix} 318.9283 & -84.0892 \\ -84.0892 & 68.4783 \end{pmatrix}.$

Table 1
Average biases and standard errors (ste) of the MCE's of the model
parameters in the simulation study.

  n   |  $a_0 = 1$          |  $\mu = 20$         |  $\Sigma$
      |  bias      ste      |  bias      ste      |  bias      ste
 100  |  0.0683    0.1289   |  1.1648    1.7784   |  4.1166    5.7951
 200  |  0.0387    0.0457   |  0.4569    0.5924   |  3.8581    4.0558
 300  |  0.0274    0.0326   |  0.1831    0.2598   |  3.0317    3.9042
 400  |  0.0157    0.0227   |  0.1575    0.2044   |  2.8210    3.5128
 500  |  0.0128    0.0161   |  0.1197    0.1790   |  2.1494    2.4973


[Figure: two panels, horizontal axis "Days" (50 to 300)]

Fig 3. Plots of daily January temperature range 1901-1910 (left) and
2001-2010 (right). The estimated mean is the interval between the two
horizontal black lines on each plot.



Denote by $A_1$ and $A_2$ respectively the random intervals from which the
two data sets are drawn. The estimated mean and variance for $A_1$ and
$A_2$ are found to be:

$E(A_1) = [4.8590, 24.9071], \quad Var(A_1) = 221.2313;$
$E(A_2) = [5.3335, 25.8416], \quad Var(A_2) = 247.3275.$

Both the mean and the variance of the recent data are larger than those of
the data from 100 years earlier. The two estimated means are also shown on
the data plots in Figure 3. In addition, the correlation coefficient of
$(\epsilon, \eta)$ is -0.3078 for data set 1 and -0.5690 for data set 2,
suggesting a negative correlation between the location and the length for
the January temperature range data in general. That is, colder days tend
to have larger temperature ranges, and this relationship is stronger in
the more recent data.
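The two correlation coefficients follow directly from the estimated covariance matrices reported above, as the short check below illustrates; the function name `correlation` is a hypothetical helper.

```python
import math

def correlation(cov):
    """Correlation coefficient implied by a 2x2 covariance matrix."""
    return cov[0][1] / math.sqrt(cov[0][0] * cov[1][1])

sigma_1 = [[207.1454, -44.8547], [-44.8547, 102.5263]]  # data set 1 estimate
sigma_2 = [[318.9283, -84.0892], [-84.0892, 68.4783]]   # data set 2 estimate
rho_1 = correlation(sigma_1)
rho_2 = correlation(sigma_2)
```

Rounded to four decimals, `rho_1` and `rho_2` agree with the values -0.3078 and -0.5690 quoted in the text.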
6. Conclusion. In this paper we introduced a new model of random sets
(specifically for random intervals). In many practical situations data are
not completely known, or are only known up to some margin of error, and it
is important to have a model that extends normality for ordinary
(numerical) data. Our hierarchical normal model extends normality for
point-valued random variables, and is quite flexible in the sense that it
is well suited for theoretical investigations as well as for simulations
and real data analysis. To this end we have defined a minimum contrast
estimator for the model parameters, and we have proved its consistency and
asymptotic normality. Our main contribution is distribution-based
estimation, as opposed to the distance-based, metric-space approach. We
carried out simulation experiments and, finally, applied our model to a
real data set (daily temperature range data obtained from the National
Weather Service). Our approach is suitable for extensions to models in
higher dimensions, e.g., a factor model for multiple random intervals, or
more general random sets, including possible extensions to spherical
random sets.

7. Proofs. 

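7.1. Proof of Theorem 1. Assume by contradiction that $\hat{\theta}_n$ does
not converge to $\theta_0$ almost surely. Then, there exists an
$\epsilon > 0$ such that

$P\left( \left\{ \omega : \limsup_{n \to \infty} \| \hat{\theta}_n(\omega) - \theta_0 \| > \epsilon \right\} \right) > 0.$

Let $F = \{\omega : \limsup_{n \to \infty} \| \hat{\theta}_n(\omega) - \theta_0 \| > \epsilon\}$
and $\Lambda := \Theta \cap \{\theta : \|\theta - \theta_0\| \ge \epsilon\}$.

By the compactness of $\Lambda$, for every $\omega \in F$, there exists a
convergent subsequence $\{\hat{\theta}_{n_i}(\omega)\}$ of
$\{\hat{\theta}_n(\omega)\}$ such that

$\hat{\theta}_{n_i}(\omega) \to \theta(\omega) \in \Lambda,$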






Y. SUN AND D. RALESCU 



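as $i \to \infty$. Writing $\theta = \theta(\omega)$ for the limit, observe
that

$\liminf_{i \to \infty} M(X(n_i); \theta_0)$
$\ge \liminf_{i \to \infty} M(X(n_i); \hat{\theta}_{n_i})$
$= \liminf_{i \to \infty} \left\{ M(X(n_i); \hat{\theta}_{n_i}) - M(X(n_i); \theta) + M(X(n_i); \theta) \right\}$
$\ge \liminf_{i \to \infty} \left\{ M(X(n_i); \hat{\theta}_{n_i}) - M(X(n_i); \theta) \right\} + \liminf_{i \to \infty} M(X(n_i); \theta)$
(23)  $= \liminf_{i \to \infty} M(X(n_i); \theta)$
$= \lim_{i \to \infty} M(X(n_i); \theta)$
$= N(\theta_0, \theta).$

The first inequality holds because $\hat{\theta}_{n_i}$ minimizes
$M(X(n_i); \cdot)$. Equation (23) follows from the equicontinuity of
$M(X(n); \theta)$. On the other hand,

$\liminf_{i \to \infty} M(X(n_i); \theta_0) = \lim_{i \to \infty} M(X(n_i); \theta_0) = N(\theta_0, \theta_0).$

Therefore,

(24)  $P\left( \{\omega : N(\theta_0, \theta_0) \ge N(\theta_0, \theta(\omega))\} \right) > 0,$

where $\theta(\omega) \in \Lambda$ and consequently
$\theta(\omega) \neq \theta_0$. But from the assumptions,
$N(\theta_0, \theta_0) < N(\theta_0, \theta(\omega))$, $\forall \omega$.
This contradicts (24), and therefore completes the proof.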

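7.2. Proof of Theorem 2. From Taylor's Theorem, we have

$0 = \frac{\partial H}{\partial \theta_i}(X(n); \hat{\theta}_n^H) = \frac{\partial H}{\partial \theta_i}(X(n); \theta_0) + \sum_{j=1}^{p} (\hat{\theta}_{n,j}^H - \theta_{0,j}) \frac{\partial}{\partial \theta_j}\left( \frac{\partial H}{\partial \theta_i} \right)(X(n); \theta_0) + \frac{1}{2} \sum_{j=1}^{p} \sum_{k=1}^{p} (\hat{\theta}_{n,j}^H - \theta_{0,j})(\hat{\theta}_{n,k}^H - \theta_{0,k}) \frac{\partial^3 H}{\partial \theta_j \partial \theta_k \partial \theta_i}(X(n); \xi_n),$

for $i = 1, \cdots, p$, where $\xi_n$ lies between $\theta_0$ and
$\hat{\theta}_n^H$. Writing the above equations in matrix form, we get

(25)  $\frac{\partial H}{\partial \theta}(X(n); \theta_0) + \left[ \frac{\partial^2 H}{\partial \theta \partial \theta^T}(X(n); \theta_0) + \frac{1}{2} \sum_{j=1}^{p} (\hat{\theta}_{n,j}^H - \theta_{0,j}) \frac{\partial}{\partial \theta_j}\left( \frac{\partial^2 H}{\partial \theta \partial \theta^T} \right)(X(n); \xi_n) \right] (\hat{\theta}_n^H - \theta_0) = 0.$

Observe, by taking derivatives under the integral sign, that
$\forall i, j$,

$\frac{\partial^2 H}{\partial \theta_j \partial \theta_i}(X(n); \theta_0)$
$= 2 \iint_S \left[ T_{\theta_0}(a, b) - \hat{T}(a, b; X(n)) \right] \frac{\partial^2 T_{\theta_0}}{\partial \theta_j \partial \theta_i}(a, b) W(a, b) \, da \, db + 2 \iint_S \frac{\partial T_{\theta_0}}{\partial \theta_j} \frac{\partial T_{\theta_0}}{\partial \theta_i}(a, b) W(a, b) \, da \, db$
$:= I + II.$

The first term is

$I = 2 \iint_S \left[ T_{\theta_0}(a, b) - \hat{T}(a, b; X(n)) \right] \frac{\partial^2 T_{\theta_0}}{\partial \theta_j \partial \theta_i}(a, b) W(a, b) \, da \, db$
$= \frac{2}{n} \sum_{k=1}^{n} \iint_S \left[ T_{\theta_0}(a, b) - Y_k(a, b) \right] \frac{\partial^2 T_{\theta_0}}{\partial \theta_j \partial \theta_i}(a, b) W(a, b) \, da \, db$
$= o_P(1),$

according to the strong law of large numbers for i.i.d. random variables,
where $Y_k(a, b)$ is the indicator defined in (28). Therefore,

$\frac{\partial^2 H}{\partial \theta_j \partial \theta_i}(X(n); \theta_0) = o_P(1) + 2 \iint_S \frac{\partial T_{\theta_0}}{\partial \theta_j} \frac{\partial T_{\theta_0}}{\partial \theta_i}(a, b) W(a, b) \, da \, db.$

In matrix form,

(26)  $\frac{\partial^2 H}{\partial \theta \partial \theta^T}(X(n); \theta_0) = o_P(1) + 2 \iint_S \left( \frac{\partial T_{\theta_0}}{\partial \theta} \right) \left( \frac{\partial T_{\theta_0}}{\partial \theta} \right)^T (a, b) W(a, b) \, da \, db.$

Observe again that $\forall j, k, i$,

$\left| \frac{\partial^3 H(X(n); \xi_n)}{\partial \theta_j \partial \theta_k \partial \theta_i} \right|$
$\le 2 \left| \iint_S \left[ T_{\xi_n}(a, b) - \hat{T}(a, b; X(n)) \right] \frac{\partial^3 T_{\xi_n}}{\partial \theta_j \partial \theta_k \partial \theta_i}(a, b) W(a, b) \, da \, db \right| + 2 \left| \iint_S \left( \frac{\partial T_{\xi_n}}{\partial \theta_j} \frac{\partial^2 T_{\xi_n}}{\partial \theta_k \partial \theta_i} + \frac{\partial^2 T_{\xi_n}}{\partial \theta_j \partial \theta_k} \frac{\partial T_{\xi_n}}{\partial \theta_i} + \frac{\partial^2 T_{\xi_n}}{\partial \theta_j \partial \theta_i} \frac{\partial T_{\xi_n}}{\partial \theta_k} \right)(a, b) W(a, b) \, da \, db \right|$
$\le 4 \iint_S \left| \frac{\partial^3 T_{\xi_n}}{\partial \theta_j \partial \theta_k \partial \theta_i}(a, b) \right| W(a, b) \, da \, db + 2 \iint_S \left| \frac{\partial T_{\xi_n}}{\partial \theta_j} \frac{\partial^2 T_{\xi_n}}{\partial \theta_k \partial \theta_i} + \frac{\partial^2 T_{\xi_n}}{\partial \theta_j \partial \theta_k} \frac{\partial T_{\xi_n}}{\partial \theta_i} + \frac{\partial^2 T_{\xi_n}}{\partial \theta_j \partial \theta_i} \frac{\partial T_{\xi_n}}{\partial \theta_k} \right|(a, b) W(a, b) \, da \, db$
$:= C_1(\xi_n) \le C_2,$

$\forall \xi_n \in \Theta$, by the compactness of $\Theta$. This, together
with the strong consistency of $\hat{\theta}_n^H$, gives

$\sum_{j=1}^{p} (\hat{\theta}_{n,j}^H - \theta_{0,j}) \frac{\partial}{\partial \theta_j}\left( \frac{\partial^2 H}{\partial \theta_k \partial \theta_i} \right)(X(n); \xi_n) = o_P(1),$

$\forall k, i$. Equivalently, in matrix form,

(27)  $\frac{1}{2} \sum_{j=1}^{p} (\hat{\theta}_{n,j}^H - \theta_{0,j}) \frac{\partial}{\partial \theta_j}\left( \frac{\partial^2 H}{\partial \theta \partial \theta^T} \right)(X(n); \xi_n) = o_P(1).$

By the multivariate Slutsky's theorem, Proposition 2, together with
equations (25), (26), and (27), yields the desired result.

8. Appendix.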



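8.1. Proof of Proposition 1. Notice that $\hat{T}(a, b; X(n))$ is the
sample mean of i.i.d. random variables $Y_i: \Omega \to \mathbb{R}$ defined
as:

(28)  $Y_i(a, b) = \begin{cases} 1, & \text{if } X_i \cap [a, b] \neq \emptyset, \\ 0, & \text{otherwise.} \end{cases}$

Therefore, an application of the strong law of large numbers in the
classical case yields:

$\frac{1}{n} \sum_{i=1}^{n} Y_i \xrightarrow{a.s.} EY_i = P(X_i \cap [a, b] \neq \emptyset) = T_{\theta_0}(a, b), \quad \text{as } n \to \infty,$

$\forall a, b : -\infty < a \le b < \infty$, and assuming $\theta_0$ is the
true parameter value. That is,

$\hat{T}(a, b; X(n)) \xrightarrow{a.s.} T_{\theta_0}(a, b),$

as $n \to \infty$. It follows immediately that

$\left[ \hat{T}(a, b; X(n)) - T_{\theta_0}(a, b) \right]^2 W(a, b) \xrightarrow{a.s.} 0.$

Notice that $\forall a, b : -\infty < a \le b < \infty$,
$\left[ \hat{T}(a, b; X(n)) - T_{\theta_0}(a, b) \right]^2 W(a, b)$ is
uniformly bounded by $4C$. By the bounded convergence theorem,

$\iint_S \left[ \hat{T}(a, b; X(n)) - T_{\theta_0}(a, b) \right]^2 W(a, b) \, da \, db \xrightarrow{a.s.} \iint_S 0 \, da \, db = 0,$

given any $S \subset \mathbb{R}^2$ with finite Lebesgue measure. This
verifies that

(29)  $P_\theta\left\{ \omega : \lim_{n \to \infty} H(X(n); \theta) = 0 \right\} = 1.$

Similarly, we also get

(30)  $P_\theta\left\{ \omega : \lim_{n \to \infty} H(X(n); \zeta) = \iint_S \left[ T_\theta(a, b) - T_\zeta(a, b) \right]^2 W(a, b) \, da \, db \right\} = 1,$

$\forall \theta, \zeta \in \Theta$. Equations (29) and (30) together imply

(31)  $N(\theta, \zeta) = \iint_S \left[ T_\theta(a, b) - T_\zeta(a, b) \right]^2 W(a, b) \, da \, db, \quad \theta, \zeta \in \Theta.$

By Assumption 2, $T_\theta(a, b) \neq T_\zeta(a, b)$ for
$\theta \neq \zeta$, except on a Lebesgue set of measure 0. This together
with (31) gives

$N(\theta, \theta) < N(\theta, \zeta), \quad \forall \theta \neq \zeta, \; \theta, \zeta \in \Theta,$

which proves that $H(X(n); \theta)$, $\theta \in \Theta$, is a family of
contrast functions. To see the equicontinuity of $H(X(n); \theta)$, notice
that $\forall \theta_1, \theta_2 \in \Theta$, we have

$|H(X(n); \theta_1) - H(X(n); \theta_2)|$
$= \left| \iint_S \left( T_{\theta_1}(a, b) - \hat{T}(a, b; X(n)) \right)^2 W(a, b) \, da \, db - \iint_S \left( T_{\theta_2}(a, b) - \hat{T}(a, b; X(n)) \right)^2 W(a, b) \, da \, db \right|$
$= \left| \iint_S \left( T_{\theta_1}(a, b) - T_{\theta_2}(a, b) \right) \left( T_{\theta_1}(a, b) + T_{\theta_2}(a, b) - 2\hat{T}(a, b; X(n)) \right) W(a, b) \, da \, db \right|$
$\le 4C \iint_S |T_{\theta_1}(a, b) - T_{\theta_2}(a, b)| \, da \, db.$

Then the equicontinuity of $H(X(n); \theta)$ follows from the continuity of
$T_\theta(a, b)$.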

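8.2. Lemma 1. Let $H(X(n); \theta)$ be the contrast function defined in
(16). Under Assumption 4,

$\sqrt{n} \, \frac{\partial H}{\partial \theta_i}(X(n); \theta_0) \xrightarrow{D} N(0, \Delta_i), \quad \text{as } n \to \infty,$

for $i = 1, \cdots, p$, where

$\Delta_i = 4 \iiiint_{S \times S} \left\{ P(X_1 \cap [a, b] \neq \emptyset, X_1 \cap [c, d] \neq \emptyset) - T_{\theta_0}(a, b) T_{\theta_0}(c, d) \right\} \frac{\partial T_{\theta_0}}{\partial \theta_i}(a, b) \frac{\partial T_{\theta_0}}{\partial \theta_i}(c, d) W(a, b) W(c, d) \, da \, db \, dc \, dd.$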

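A NORMAL HIERARCHICAL MODEL FOR RANDOM INTERVALS

Proof. We will write
$\frac{\partial T_{\theta_0}}{\partial \theta_i}(a, b) = T^{i}_{\theta_0}(a, b)$
to simplify notation. Exchanging differentiation and integration by the
bounded convergence theorem, we get

(32)  $\frac{\partial H}{\partial \theta_i}(X(n); \theta_0)$
$= \frac{\partial}{\partial \theta_i} \iint_S \left( T_\theta(a, b) - \hat{T}(a, b; X(n)) \right)^2 W(a, b) \, da \, db \, \Big|_{\theta = \theta_0}$
$= \iint_S \frac{\partial}{\partial \theta_i} \left( T_\theta(a, b) - \hat{T}(a, b; X(n)) \right)^2 \Big|_{\theta = \theta_0} W(a, b) \, da \, db$
$= \iint_S 2 \left( T_{\theta_0}(a, b) - \hat{T}(a, b; X(n)) \right) T^{i}_{\theta_0}(a, b) W(a, b) \, da \, db.$

Define $Y_k(a, b)$ as in (28). Then,

(32) $= \iint_S 2 \left( T_{\theta_0}(a, b) - \frac{1}{n} \sum_{k=1}^{n} Y_k(a, b) \right) T^{i}_{\theta_0}(a, b) W(a, b) \, da \, db$
$= \frac{2}{n} \iint_S \sum_{k=1}^{n} \left( T_{\theta_0}(a, b) - Y_k(a, b) \right) T^{i}_{\theta_0}(a, b) W(a, b) \, da \, db$
(33)  $= \frac{1}{n} \sum_{k=1}^{n} 2 \iint_S \left( T_{\theta_0}(a, b) - Y_k(a, b) \right) T^{i}_{\theta_0}(a, b) W(a, b) \, da \, db$
$:= \frac{1}{n} \sum_{k=1}^{n} R_k.$

Notice that the $R_k$'s are i.i.d. random variables:
$\Omega \to \mathbb{R}$.

Let $\{\Delta s_1, \Delta s_2, \cdots, \Delta s_m\}$ be a partition of $S$,
and $(a_j, b_j)$ be any point in $\Delta s_j$, $j = 1, \cdots, m$. Let
$\lambda = \max_{1 \le j \le m} \{\mathrm{diam} \, \Delta s_j\}$. Denote by
$\Delta\sigma_j$ the area of $\Delta s_j$. By the definition of the double
integral,

$R_k = 2 \iint_S \left( T_{\theta_0}(a, b) - Y_k(a, b) \right) T^{i}_{\theta_0}(a, b) W(a, b) \, da \, db$
$= 2 \lim_{\lambda \to 0} \sum_{j=1}^{m} \left( T_{\theta_0}(a_j, b_j) - Y_k(a_j, b_j) \right) T^{i}_{\theta_0}(a_j, b_j) W(a_j, b_j) \Delta\sigma_j.$

Therefore, by the Lebesgue dominated convergence theorem,

$ER_k$
$= 2E \left\{ \lim_{\lambda \to 0} \sum_{j=1}^{m} \left( T_{\theta_0}(a_j, b_j) - Y_k(a_j, b_j) \right) T^{i}_{\theta_0}(a_j, b_j) W(a_j, b_j) \Delta\sigma_j \right\}$
$= 2 \lim_{\lambda \to 0} \sum_{j=1}^{m} \left[ E \left( T_{\theta_0}(a_j, b_j) - Y_k(a_j, b_j) \right) \right] T^{i}_{\theta_0}(a_j, b_j) W(a_j, b_j) \Delta\sigma_j$
$= 0.$

Moreover,

$Var(R_k) = ER_k^2$
$= 4E \left\{ \lim_{\lambda_1 \to 0} \sum_{j_1=1}^{m_1} \left( T_{\theta_0}(a_{j_1}, b_{j_1}) - Y_k(a_{j_1}, b_{j_1}) \right) T^{i}_{\theta_0}(a_{j_1}, b_{j_1}) W(a_{j_1}, b_{j_1}) \Delta\sigma_{j_1} \cdot \lim_{\lambda_2 \to 0} \sum_{j_2=1}^{m_2} \left( T_{\theta_0}(a_{j_2}, b_{j_2}) - Y_k(a_{j_2}, b_{j_2}) \right) T^{i}_{\theta_0}(a_{j_2}, b_{j_2}) W(a_{j_2}, b_{j_2}) \Delta\sigma_{j_2} \right\}$
$= 4 \lim_{\lambda_1 \to 0} \lim_{\lambda_2 \to 0} \sum_{j_1=1}^{m_1} \sum_{j_2=1}^{m_2} E \left[ \left( T_{\theta_0}(a_{j_1}, b_{j_1}) - Y_k(a_{j_1}, b_{j_1}) \right) \left( T_{\theta_0}(a_{j_2}, b_{j_2}) - Y_k(a_{j_2}, b_{j_2}) \right) \right] T^{i}_{\theta_0}(a_{j_1}, b_{j_1}) T^{i}_{\theta_0}(a_{j_2}, b_{j_2}) W(a_{j_1}, b_{j_1}) W(a_{j_2}, b_{j_2}) \Delta\sigma_{j_1} \Delta\sigma_{j_2}$
$= 4 \lim_{\lambda_1 \to 0} \lim_{\lambda_2 \to 0} \sum_{j_1=1}^{m_1} \sum_{j_2=1}^{m_2} Cov \left( Y_k(a_{j_1}, b_{j_1}), Y_k(a_{j_2}, b_{j_2}) \right) T^{i}_{\theta_0}(a_{j_1}, b_{j_1}) T^{i}_{\theta_0}(a_{j_2}, b_{j_2}) W(a_{j_1}, b_{j_1}) W(a_{j_2}, b_{j_2}) \Delta\sigma_{j_1} \Delta\sigma_{j_2}$
$= 4 \iiiint_{S \times S} Cov \left( Y_k(a, b), Y_k(c, d) \right) T^{i}_{\theta_0}(a, b) T^{i}_{\theta_0}(c, d) W(a, b) W(c, d) \, da \, db \, dc \, dd$
$= 4 \iiiint_{S \times S} \left\{ P(X_k \cap [a, b] \neq \emptyset, X_k \cap [c, d] \neq \emptyset) - T_{\theta_0}(a, b) T_{\theta_0}(c, d) \right\} T^{i}_{\theta_0}(a, b) T^{i}_{\theta_0}(c, d) W(a, b) W(c, d) \, da \, db \, dc \, dd.$

From the central limit theorem for i.i.d. random variables, the desired
result follows. □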

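8.3. Proof of Proposition 2. By the Cramér-Wold device, it suffices to
prove

(34)  $\sqrt{n} \sum_{i=1}^{p} \lambda_i \frac{\partial H}{\partial \theta_i}(X(n); \theta_0) \xrightarrow{D} N\left( 0, \sum_{1 \le i, j \le p} \lambda_i \lambda_j \Xi(i, j) \right)$

for arbitrary real numbers $\lambda_i$, $i = 1, \cdots, p$. It is easily
seen from (33) in the proof of Lemma 1 that

$\sum_{i=1}^{p} \lambda_i \frac{\partial H}{\partial \theta_i}(X(n); \theta_0)$
$= \frac{1}{n} \sum_{k=1}^{n} \left( 2 \sum_{i=1}^{p} \lambda_i \iint_S \left( T_{\theta_0}(a, b) - Y_k(a, b) \right) \frac{\partial T_{\theta_0}}{\partial \theta_i}(a, b) W(a, b) \, da \, db \right)$
$:= \frac{1}{n} \sum_{k=1}^{n} \left( 2 \sum_{i=1}^{p} \lambda_i Q_k^i \right).$

By Lemma 1,

$E \left( 2 \sum_{i=1}^{p} \lambda_i Q_k^i \right) = 2 \sum_{i=1}^{p} \lambda_i \cdot 0 = 0.$

In view of the central limit theorem for i.i.d. random variables, (34) is
reduced to proving

(35)  $Var \left( 2 \sum_{i=1}^{p} \lambda_i Q_k^i \right) = \sum_{1 \le i, j \le p} \lambda_i \lambda_j \Xi(i, j).$

By a similar argument as in Lemma 1, together with some algebraic
calculations, we obtain

$Var \left( 2 \sum_{i=1}^{p} \lambda_i Q_k^i \right)$
$= 4 \sum_{1 \le i, j \le p} \lambda_i \lambda_j Cov(Q_k^i, Q_k^j)$
$= 4 \sum_{1 \le i, j \le p} \lambda_i \lambda_j E \left\{ \iint_S \left( T_{\theta_0}(a, b) - Y_k(a, b) \right) \frac{\partial T_{\theta_0}}{\partial \theta_i}(a, b) W(a, b) \, da \, db \cdot \iint_S \left( T_{\theta_0}(a, b) - Y_k(a, b) \right) \frac{\partial T_{\theta_0}}{\partial \theta_j}(a, b) W(a, b) \, da \, db \right\}$
$= 4 \sum_{1 \le i, j \le p} \lambda_i \lambda_j \iiiint_{S \times S} \left\{ P(X_1 \cap [a, b] \neq \emptyset, X_1 \cap [c, d] \neq \emptyset) - T_{\theta_0}(a, b) T_{\theta_0}(c, d) \right\} \frac{\partial T_{\theta_0}}{\partial \theta_i}(a, b) \frac{\partial T_{\theta_0}}{\partial \theta_j}(c, d) W(a, b) W(c, d) \, da \, db \, dc \, dd.$

This validates (35), and hence finishes the proof.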